[[vanishing_gradient_problem|vanishing gradient problem]]

📓 garden/KGBicheno/Artificial Intelligence/Introduction to AI/Week 3 - Introduction/Definitions/Vanishing_Gradient_Problem.md by @KGBicheno

vanishing gradient problem

Go back to the [[AI Glossary]]

#seq

The tendency for the gradients of the early hidden layers of some deep neural networks to become very small (flat). Backpropagation computes the gradient for an early layer as a product of terms from every later layer, so when those terms are smaller than 1 the gradient shrinks roughly exponentially with depth. Smaller gradients produce smaller updates to the weights in those layers, leading to little or no learning. Models suffering from the vanishing gradient problem become difficult or impossible to train. Long Short-Term Memory (LSTM) cells were designed to address this issue in recurrent networks.
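The shrinking effect can be seen in a toy example. The sketch below is only an illustration, not taken from the original note: it assumes a chain of scalar sigmoid units with a shared weight, and the names `sigmoid` and `gradient_through_chain`, the weight of 1.0, and the input of 0.5 are all hypothetical choices. It applies the chain rule layer by layer and prints the gradient of the output with respect to the input at several depths.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def gradient_through_chain(depth, weight=1.0, x=0.5):
    """Return d(output)/d(input) for a chain of `depth` scalar sigmoid units.

    Each unit computes a_next = sigmoid(weight * a). The toy network,
    its weight, and its input are illustrative assumptions.
    """
    activation = x
    grad = 1.0
    for _ in range(depth):
        activation = sigmoid(weight * activation)
        # Chain rule: d sigmoid(w * a) / da = w * s * (1 - s), and s * (1 - s) <= 0.25
        grad *= weight * activation * (1.0 - activation)
    return grad


# The gradient reaching the input shrinks quickly as the chain gets deeper.
for depth in (1, 5, 10, 20):
    print(depth, gradient_through_chain(depth))
```

Because the sigmoid's derivative never exceeds 0.25, each extra layer in this toy chain multiplies the gradient by at most 0.25 when the weight is 1.0, which is why the value reaching the earliest layer collapses toward zero.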

Compare to the [[exploding gradient problem]].
